Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Note that path state cannot be removed without reception of PATH_ABANDON #475

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

mirjak
Copy link
Collaborator

@mirjak mirjak commented Dec 11, 2024

fixes #471

Copy link
Contributor

@qdeconinck qdeconinck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Contributor

@huitema huitema left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think this paragraph is necessary. I also think it is probably wrong. Suppose that I put a path in a "locally abandoned" state, and:

  • send a last ACK for the packets received so far,
  • consider all the packets sent so far and not acknowledged,
  • drop all incoming packets on that path without processing them.

I have at that point removed 99% of the path state -- the only resource that I need to keep is the path identifier, and I will only remove that when the ABANDON is received. But I have removed the bulk of the memory allocation: the copy of packets waiting for an ACK, the list of packets received from the peer that should be acked, probably the CID allocated for the path as well.

Yes, that will lead to packet loss, but so what? I have sent the ABANDON already, the peer should not be using the path, if it keeps doing so, too bad. There is never a guarantee that a packet will be received, and there is no protocol violation if the packets that were dropped are not acked.

Which points that [see next comment]

@mirjak
Copy link
Collaborator Author

mirjak commented Dec 16, 2024

@huitema looks like you comment above is half finished only...?

So what exactly is wrong about the current text?

I proposed this PR because the agreement in issue #471 that we wanted to note this issue. Please continue discussion there if you changed your mind and think we should not do anything.

@huitema
Copy link
Contributor

huitema commented Dec 17, 2024

The only recommendation that we should make is to not take any drastic action too soon. For example, it would be bad if application waited less than 3 PTO before closing the connection because they have not received a PATH_ABANDON yet.

draft-ietf-quic-multipath.md Outdated Show resolved Hide resolved
@mirjak
Copy link
Collaborator Author

mirjak commented Dec 17, 2024

I entered your proposed change which is fine for me. But based on your second comment, do you want to add anther sentence like:

"An endpoint SHOULD NOT remove path state earlier than 3 PTOs after sending the PATh_ABANDON frame".

?

@huitema
Copy link
Contributor

huitema commented Dec 17, 2024

Yes, please add the warning against removing state too soon.

draft-ietf-quic-multipath.md Outdated Show resolved Hide resolved
@mirjak mirjak requested review from huitema and qdeconinck December 23, 2024 14:13
Copy link
Contributor

@huitema huitema left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should add this paragraph. If we do, I would just keep the first sentence and the first words of the second. The alternative is a full page of text, maybe in the "implementation" section, explaining the various tradeoffs. But I have no appetite for that.

It is left to the implementation to handle this unexpected
behavior as it does not impact interoperability. However, an endpoint SHOULD NOT
remove path state earlier than 3 PTOs after sending the PATh_ABANDON frame.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What state are we speaking about? I think that we are saying either too much or too little. There many kinds of "states":

  1. Lists of packets that were sent on the path and not yet acked by the peer,
  2. Packets that were received from the peer and processed but not yet acked,
  3. the total number of paths considered active, which some implementation use to decide when to increase the Max Path ID.

Suppose that an implementation has abandoned a path, had not received a Path Abandon from the peer, and is losing patience. It may do a number of things:

  1. It may decide to drop any incoming packet on that path. That has no interoperability impact, as long as those dropped packets are never acknowledged. Indeed, that's exactly what would happen if the path had become bad and packets were dropped in the network. But it has some performance impact, since some packets will be in transit when the Path Abandon is sent. It is recommended to not do that earlier than 3 PTOs after sending the PATH_ABANDON frame.

  2. It may decide to consider all packets send on that path as lost, and resend the frames through another path. That's fine. It might cause some duplicate transmission, but that also happens for other reasons (e.g., bad timers) and has no interoperability impact. The rules there are murky, because this is a tradeoff -- resend sooner minimizes the end-to-end latency caused by losses, but resend sooner also increases the risk of duplicate transmission, which reduces efficiency. The 3PTO rule is probably one end of the tradeoff: good mitigation of duplicate transmission, poor management of latency. A 1PTO rule would be a different tradeoff, more focused on managing latency.

  3. It should send the "missing ACKs" on another path before clearing the missing ACK state. Again, that's an efficiency tradeoff. If the node does not send acknowledgements, it risks causing spurious retransmission.

  4. It may well decide to never increase the MAX_PATH_ID until the PATH_ABANDON has been received. That's a reasonable way for an implementation to conserve resource.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just say:

If a peer sends a PATH_ABANDON frame but never receives
a corresponding PATH_ABANDON frame, it might not be able to remove path state.
It is left to the implementation to handle this unexpected
behavior.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What state are we speaking about? I think that we are saying either too much or too little. There many kinds of "states":

  1. Lists of packets that were sent on the path and not yet acked by the peer,
  2. Packets that were received from the peer and processed but not yet acked,
  3. the total number of paths considered active, which some implementation use to decide when to increase the Max Path ID.

Packet handling as mentioned in point 1 and 2 seems actually partly orthogonal here. If you have send the path abandon frame, you should probably not have any outstanding packets that need to be acked on that path. However, you should also never just forgot about outstanting/non-acked packets. If you decide to remove state about non-ack packets, you need to declare them lost and resent on another path. I guess if you really want to sent a path abandon frame even though you have outstanding packets, you should probably declare them lost and resent on another path right away.

About your point 2, I don't think there is an issue here and I'm not sure when this case could/would happen. If you have received packets, I guess you should first send the ACK and then the path abandon. Of course you can also send ACKs on other paths still.

The important part is about packet handling of new incoming packets and you should be able to process those for at least 3 PTOs after sending the path abandon frame. This means, you need to remember the path ID and corresponding CID(s) as well as the packet number space state. However, as soon as you remove the CID(s), you cannot process any new incoming packets anymore and as such you can then also remove all packet number space state as well for received packets. For out-going packets, it's really on you if you remove state even though there are ACKs missing; however, as I said above that means you have to retransmit on another paths.

The interesting case is actually your point 3. If you remove path state that means you remove the path ID and will consider it as not active anymore. As such you could send a new max path ID. However, as the other end is not sending you a path abandon frame it will consider the path ID as active and not increase its max path ID. Respectively, you might not be able to open a new path. But I guess that's what we have the path blocked frame for. So I guess there is no problem either way.

So should we maybe say this to be more clear:

However, an endpoint SHOULD process incoming packets for the abandoned Path ID and corresponding connection IDs for at least 3 PTOs after sending the PATh_ABANDON frame.

@huitema
Copy link
Contributor

huitema commented Dec 30, 2024

OK, I need to explain why I said first "Yes, please add the warning against removing state too soon", and then "just keep the first sentence and the first words of the second." I did indeed change my mind. That's because most of the issues about releasing state too soon are already discussed in the draft, in particular is 3.3.1. Avoiding Spurious Stateless Resets.

The "spurious stateless reset" issue is the main reason to be cautious with unilateral removal of state. Dropping packets is kind of bad, but will not crash the connection. Dropping the CID sent to the peer before receiving the PATH_ABANDON from the peer is worse, because of the risk of generating a spurious stateless reset and thus killing the connection.

So maybe we should say:

It is left to the implementation to handle this unexpected
behavior. Implementation MAY shed some of the path state and reduce
resource consumption if the PATH_ABANDON is
not arriving in a reasonable time, but they SHOULD
take steps and avoid sending spurious stateless reset (see {{spurious-stateless-reset}}).

@mirjak
Copy link
Collaborator Author

mirjak commented Jan 10, 2025

I'm not sure about your proposed text above, Christian. First it's very vague saying "shed some path state and reduce resource consumption" as well as "in a reasonable time". However, I don't think it addresses the issue. If you send a path_abandon frame you immediately retire all CIDs that you received from the other end which means you can't not sent on the path anymore. That avoids connection closure of spurious reset, as also mentioned in the paragraph just before his new proposed text, and it exactly what path_abandon stand for - it a clear promise that you won't sent on the path anymore.

The problem we are trying to address is about retiring the CIDs that you provided to the peer. Because those you are expected to only retire when you receive a path_abandon from the other end. And here I think we should be clear that there is also a recommend minimal time of 3 PTO before retiring the CID that were issues to the other end. But I think we could be more concrete in the proposed text and not talk about "path state" but more specifically about "issued CIDs". Would that help (see proposed commit)?

a corresponding PATH_ABANDON frame, it might not be able to remove path state.
It is left to the implementation to handle this unexpected
behavior as it does not impact interoperability. However, an endpoint SHOULD NOT
remove path state earlier than 3 PTOs after sending the PATh_ABANDON frame.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
remove path state earlier than 3 PTOs after sending the PATh_ABANDON frame.
remove issued CIDs earlier than 3 PTOs after sending the PATh_ABANDON frame.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

If the peer does not reply with a path_abandon frame, state can never be removed
3 participants